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Abstract 

Consider a system of N identical hard spherical particles moving in a d-dimensional 
box and undergoing elastic, possibly multi-particle, collisions. We develop a new 
algorithm that recovers the pre-collision state from the post-collision state of the 
system, across a series of consecutive collisions, with essentially no memory over- 
head. The challenge in achieving reversibility for an n-particle collision (where, in 
general, n <C N) arises from the presence of nd — d— 1 degrees of freedom (arbitrary 
angles) during each collision, as well as from the complex geometrical constraints 
placed on the colliding particles. To reverse the collisions in a traditional simulation 
setting, all of the particular realizations of these degrees of freedom (angles) during 
the forward simulation must be tracked. This requires memory proportional to the 
number of collisions, which grows very fast with JV and d, thereby severely limiting 
the de facto applicability of the scheme. This limitation is addressed here by first 
performing a pseudo-randomization of angles, which ensures determinism in the 
reverse path for any values of n and d. To address the more difficult problem of 
geometrical and dynamic constraints, a new approach is developed which correctly 
samples the constrained phase space. Upon combining the pseudo-randomization 
with correct phase space sampling, perfect reversibility of collisions is achieved, as 
illustrated for n < 3, d = 2, and n — 2, d = 3. This result enables, for the first 
time, reversible simulations of elastic collisions with essentially zero memory accu- 
mulation. In principle, the approach presented here could be generalized to larger 
values of n. The reverse computation methodology presented here uncovers impor- 
tant issues of irreversibility in conventional models, and the difficulties encountered 
in arriving at a reversible model for one of the most basic and widely used physical 
system processes, namely, elastic collisions for hard spheres. Insights and solution 
methodologies, with regard to accurate phase space coverage with reversible ran- 
dom sampling proposed in this context, can help serve as models and/or starting 
points for other reversible simulations. 
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1 Introduction 



1.1 Background 

Modeling and simulation of particle collisions for a wide range of collision types has been 
a subject of scientific interest for a long time. However, while the forward simulation of 
collisions is relatively well understood, their reversible simulation is much less so. Here, 
we study the problem of modeling and simulating elastic collisions reversibly. 

In a system of TV identical hard spherical particles colliding with each other in a 
d-dimensional box, every elastic collision, in general, yields an underspecified system of 
equations and inequalities [TM80]. The underspecified nature of the system gives rise 
to interesting challenges for reversibility, such as the issue of memory accumulation and 
the need for unbiased phase space coverage. 

In each multi-particle collision of n <C N hard particles in d dimensions, the velocities 
represent nd variables in the post-collision state, which are related to the nd velocities 
in the pre-collision state via conservation laws. Upon applying conservation of momenta 
and kinetic energy, one is left with nd — d — 1 unspecified degrees of freedom. Deter- 
ministic solutions to this underspecified system add new constraints that correspond to 
specific additional assumptions on the collisions. For example, in a 2-particle collision, 
determinism may be induced by exchanging velocity components along the line joining 
the centers of the two particles [Lub91, Mar97a, Mar97b, HBD+89]. Non-deterministic 
solutions can be obtained by infusing the appropriate amount of randomization needed 
to cover the phase space of the underspecified system in a complete and unbiased man- 
ner. Of particular concern, is accounting for geometric constraints, e.g., disallowing 
particle-overlapping at all times. 

The goal of our work is to develop a modeling and simulation framework which 
accurately and completely recovers the pre-collision state from the post-collision state 
of all particles involved in every collision in a sequence of collisions in the system, with 
essentially no memory overhead. 

1.2 Organization 

In Section 2, the problem is defined and terminology is introduced. The basic outline 
of our approach to reversal of elastic collisions is outlined in Section 3. This is followed 
by Section 4 which gives detailed reversal algorithms for 2-particle collisions (up to 
3 dimensions) and 3-particle collisions (up to 2 dimensions). Results from experiments 
with a software realization of the algorithms are presented in Section 5. An estimation of 
the potential performance gains from reverse computation, in comparison to conventional 
state saving approaches, is given in Section 6. The findings are summarized in Section 7. 

2 Problem Definition and Terminology 

2.1 Reversibility Problem 

Consider a system consisting of ./V identical hard spheres of diameter D in a d dimen- 
sional cubic box undergoing elastic collisions among themselves and/or with the walls 
of the box. Reversible simulation of particle collisions in the system means that, at any 
moment, the simulation can be stopped and executed backwards to arbitrary points in 
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the past, potentially all the way to the initial state, such that the positions and veloc- 
ities of particles are restored to the same values that the particles had at the chosen 
point in the past. The problem is to achieve this reversible simulation with minimal or, 
ideally, no memory overhead. The reversal scheme must support starting the system 
with arbitrary initial configurations, and must be able to evolve the system to arbitrary 
numbers of collisions into the future. Moreover, given that particles are marked with 
identifiers from f to N, the system state must be restored to the correct initial identifier 
assignation. 

An important requirement of the collision model is the uniform (unbiased) coverage 
of all available phase space. In particular, the scattering law upon each collision must 
uniformly sample the full range of restitution angles available to each pair of particles 
upon their collision. 

2.2 Collision Configurations and Constraints 

Here we consider collisions of three types: (1) single particle-wall collisions, (2) n- 
particlc-wall collisions, and (3) n-particle collisions (n > I). The hrst type is straightfor- 
ward to reverse. The second type is treated, without loss of generality, as a simultaneous 
set of individual wall-particle collisions, followed by n-particle collision (for any collisions 
that remain after the application of individual wall-particle collisions). The third is the 
more complex problem treated in the remainder of the paper. 

The collision of a particle with a wall is modeled by changing the sign of the velocity 
component that is orthogonal to the wall. If a particle touches more than one wall 
simultaneously at an edge or a corner of the box, all the appropriate velocity components 
change their sign. Other commonly used deterministic boundary conditions such as 
periodic wall boundaries [Lub91, Mar97a, Mar97b, Kra96] can be reversed analogously. 

For completeness, we list here the set of system configurations that are either trivial 
configurations that are not of interest, or are simple to solve separately without needing 
the complexity of our algorithm. 

• Trivial or degenerate situations, in which particles never come in contact. Exam- 
ples include all particles being at rest (zero kinetic energy), which is clearly an 
uninteresting case. 

• Situations in which particles do come in contact, but do not exchange momentum. 
Examples in the first category include initial conditions in which: 

— all particles are at rest, or 

— all N particles have velocities along one coordinate axis (say x) and the area of 
the particles' projections on the (y, z) plane is vN ® (this perpetually collides 
particles with walls but never results in any particle-particle collisions). 

Examples in the second category include grazing collisions, for which the line 
passing through the particles' centers at the encounter time is perpendicular to 
their (parallel or anti-parallel) velocities. This condition is easily detected and 
either ignored or a degenerate collision operator can be defined separately and 
applied. 
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2.3 Dynamics and Geometry 



In an n-particle collision, let V'i be the pre-collision velocity of particle i, and V. its post- 
collision velocity. For every pair of particles i and j that are in contact in the collision, 
let fji be the vector from center of particle j to that of particle i at the collision moment. 
Let the total momentum of the n colliding particles be M and their total kinetic energy 
be E. Then, the dynamics and geometry require the following: 



i=l i=l 

n n 



i=i 



i=i 



> Dynamics, 



(1) 



Vi, j such that particles 
i and j are in contact 



Fji ■ {V'i - V'j) < (pre-collision) 1 Qeometry (2) 
Tji ■ (V — Vj ■ ) > (post-collision) J 



In Equation (2), the strictness of the inequalities (< instead of <, and > instead of 
>) ensures that grazing collisions are excluded. 

Let d n — nd — d — 1 denote the number of degrees of freedom in the collision. 

Denote by $ = {</>fc|l < k < d n } the set of "parameters" that, given M and E, 
uniquely determines the set of velocities V = {Vi\l < i < n} of all particles in an 
n-particle collision. Let $ correspond to the parameters encoding the pre-collision 
velocities, and let $ (or, where ambiguous, $ ) encode the post-collision velocities. In 
2-particle collisions, the parameters correspond to geometrical angles of points on the 
surface of a <i„-sphere, whereas, in 3-particle collisions, the parameters are different from 
the geometrical angles of points on the surface of a (d n + l)-dimensional ellipsoid. 

In randomly sampling the phase space spanned by the d n parameters, at least d n 
random numbers must be generated per collision. Reversible random number generators 
are used to generate the needed random samples per collision, each sample value uni- 
formly distributed in [0, 1). The generators themselves require only a constant amount 
of memory. In practice, the memory per generator is often only a few bytes long. In 
theory, the memory only needs to be independent of the number of collisions being 
simulated. Additional considerations in random number generation that are critical to 
reversibility are discussed in Section 4.6. 

Our reversible collision algorithm uses the following mappings. 

V-to-<f>: Given V, this mapping determines the set of angles $ that uniquely deter- 
mines V . In forward execution, this mapping will be applied on pre-collision velocities 

V to obtain <f> . In reverse execution, this mapping function will be used to determine 
$ from the post-collision velocities V in a collision. 

<f>-to-V: Given the angles <I>, together with M and E, this mapping reconstructs 
all corresponding V. In forward execution, this mapping will be used to generate the 
post-collision velocities V based on reversibly computed values of $. The same mapping 
function will also be used in reverse execution to determine the pre-collision velocities 

V from the recovered value of $ of a collision. 

G: Let G = {Gfe|l < k < d n } denote the set of pseudo random numbers generated 
for each collision, where each Gk G [0, 1). The generator used to generate G is reversible, 
of high quality, and with a sufficiently long period. The generator can either be a single 
stream or contain d n independent, parallel streams. 
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G-to-Sl/: Further, let ^> = {tpk\^ < k < d n }, be the set of angle offsets generated 
from random numbers G. In both forward as well as reverse execution, the G-to-*$> 
function will be used as a deterministic mapping from uniform random numbers to 
angle offsets. The specific mapping function depends on n and d, but the function only 
depends on the (reversible) random numbers, total momenta and total energy; it does 
not depend on the individual velocities of particles in collision. 

In the remainder of the article, unless otherwise specified, all arithmetic on the angles 
will be performed modulo 2ir. 

2.4 Simplified Notation for 2 < n < 3, 1 < d < 3 

When dealing with collisions in which at most three particles are in contact with each 
other, in 1- or 2-dimensional space (and also in the case in which two particles are 
in contact with each other in 3-dimcnsional space), a simplified notation is used in the 
remaining of the article to refer to their velocities, momenta, and energy. The letters a-f 
will be used to refer to the components of the velocities along the x, y, and z directions. 
The letters a'-f will be used to refer to the corresponding pre-collision values of the 
velocity components. The letters a, /?, and 7 will be used as the sums of momenta along 
the x, y and z spatial directions respectively, and 8 will be used for total kinetic energy. 
Thus, for example, in a 2-dimensional, 2-particle collision, the post-collision equations 
will be written as a + b = a (total momentum in the x-direction), c + d = (3 (total 
momentum in the y-direction), and a 2 + b 2 + c 2 + d 2 = 5 (total energy), where a = V\ x , 
b = V2X, c = V\ y and d = V-^y are the velocities of the two particles in the x and y 
directions. In the remainder of the article, note that <5 > 0, since the system must have 
non-zero kinetic energy for collisions to occur. 

3 Skeleton of the Algorithm 

In any n-particle collision, the information available at hand during forward execution 
are the pre-collision velocities as well as the ranges of the angles that determine the 
range of permissible post-collision velocities. The pre-collision velocity configuration of 
all the n colliding particles can be uniquely encoded in terms of the total momentum, 
total energy, and the specific values of the free angles. Since all collisions are elastic, 
the total momentum and energy do not need any memory to recover. The problem thus 
reduces to that of developing a one-to-one mapping of pre-collision and post-collision 
angles, while still uniformly sampling all available phase space for the angles at every 
collision. 

In order to properly sample the available phase space, we use pseudo random numbers 
to select the angle values from the permissible ranges. The reversibility of the pseudo 
random number generators ensures the ability to go forward as well as backward in the 
random number sequence as needed, during forward and reverse execution, respectively. 

3.1 Reversible Collision Operation 

The problem reduces now to developing a collision algorithm that (a) takes d n pseudo 
random numbers, each uniformly distributed in [0,1), and gives d n reversible random 
angle offsets that satisfy the pre-collision phase space constraints, and (b) recovers the 
pre-collision angles from the random offsets recovered upon backward execution. 
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Once such a collision algorithm is developed, it can be used to uniquely recover 
the pre-collision velocities as follows. First, the random number sequence is reversed, 
thereby recovering the random numbers that were used in the forward execution. These 
are then used to recreate the random offsets that were used in the forward execution. 
The random offsets are applied in the opposite direction on the post-collision values of 
the free angles to uniquely recover the pre-collision values of the free angles, which in 
turn uniquely give the pre-collision velocities. 

3.2 Collision Sequences 

Collision sequences are modeled using standard techniques [Lub91, ML04, Kra96], by 
which, at every step of forward evolution, the time for next collision of each particle 
is determined, time is advanced to the earliest collision time, the particles undergoing 
collision are determined, their pre-collision velocities are transformed by our reversible 
algorithm to give post-collision velocities, and the process is repeated. 

In order to reverse the collision sequence, it is necessary to recover the most recent 
collision event in the past. That event is obtained by reversing the direction of all 
particles' velocities and employing the usual forward algorithm for determining the next 
earliest collision. The event represents information on the amount of time, dt, to go back 
in time, and the identities of the colliding particles. The system is then stepped back 
in time by dt units by changing the sign of the velocities of all particles, and linearly 
transporting them for dt units. At that moment, clearly, the colliding particles would be 
found to be in contact. The reverse collision algorithm is then applied on the particles 
in contact. This process is repeated iteratively until the time of interest in the past is 
reached. 

3.3 General and Specific Settings 

This general reversal scheme is applicable in principle to any values of n and d. However, 
the actual permissible ranges of the free angle values remain to be determined for each 
(n, d) pair. As illustrations, we develop the geometrical constraints for the subsets 
n < 3, d < 2 and n = 2, d = 3. While the development of the permissible angle ranges 
for n — 2 is relatively straightforward, those for n = 3 become complex starting even 
from d = 1. 

4 Algorithm Implementation 

Here, we determine the phase space of the permissible reversible random offsets that 
need to be sampled for 2-particle collisions in I, 2, and 3 dimensions, and for 3 particles 
in I and 2 dimension. 

4.1 Reversal for 2-Particle Collisions in 1 Dimension 

In a collision of 2 particles in I-dimensional space, as shown in Figure 1, let Pi be the 
particle on the right moving with velocity a, and P2 be the particle on the left moving 
with velocity b. The constraints on dynamics and post-collision geometry are: 
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± *± 

Figure 1: Canonical configuration of 2 particles in 1 dimension 



a 

a 2 + b 2 = 5,25 >a 2 ) 

Geometry. (4) 



> Dynamics, (3) 



r 2 i • (a - b) > 
for r 2 i = D > 



The pre-collision geometrical constraints are obtained by replacing > by < in 
the post-collision constraints. The system is fully defined without any free angles, giv- 
ing deterministic one-to-one mapping from pre-collision to post-collision velocities and 
hence reversibility is unambiguous. Nevertheless, this configuration is treated here for 
completeness. The equations of motion imply that: 

= « + VSEf? and 6 = « _ ^MZZ} Dynamics, (5) 

r 2 i ■ (a - b) = +r 2 i \J 25 - a 2 > } Geometry, (6) 
i.e., the dynamical solution satisfies the geometrical constraint. 

4.2 Reversal for 2-Particle Collisions in 2 Dimensions 

In a collision of 2 particles in 2-dimensional space, let a and c be the x and y velocity 
components of the particle with the smaller identifier, and b and d be the corresponding 
velocity components of the other particle. The equations of motion and post-collision 
geometry are: 



a + b = a, c + d = ft 
a 2 + b 2 + c 2 + d 2 = (5, and 25 > a 2 + /3 2 } 



> Dynamics, (7) 



r 21x - («-fc) + ^-(c-rf)>0l ometry 
for some r 2 \ x , r2i y > OJ 

The pre-collision geometrical constraints are obtained by replacing > by < in the 
post-collision constraints. The equations of motion imply that: 



(«-?) 



a + fc-fV = 2d "-<°, a+ ^W, 



i.e., the point (a, c) lies on a circle of radius R = \/25 — (a 2 + (3 2 + 7 2 )/2 centered at 
(a/2, (3/2), whose parametric equations are given by: 
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a = — + iicos^i , c = ^ + i?sin0i , and 0i e [0,2tt). (10) 

Equation (10) provides the V-to-<f> and &-to-V mapping functions for 2-dimensional 
2-particle collisions. Given (a', c', d'), a unique can be determined. Similarly, given 
(<pi, a, /3, 5), (a, 6, c, ef) can be uniquely determined. 




Figure 2: Configuration of 2 particle collision in 2 dimensions in the center-of-mass frame 

Given the pre-collision angle <fi 1 , the problem at hand is the generation of a reversible 
random offset ip from <p> 1 to give the post-collision angle <p\ — <p l + ip which can then be 
used to determine the post-collision velocities. Since the phase space of <p\ is [0,2-71"), it 
is necessary to sample the random offset ip also from the same range, in order to ensure 
full phase space coverage independent of <p 1 . This is illustrated in Figure 2 in which the 
colliding pair is visualized in its center-of-mass frame of reference (in this particular case 
of 2-particles in 2-dimensions, the geometric angle (pi indeed corresponds to the degree 
of freedom, but this does is not true in general). Moreover since the circumference of the 
circle is uniformly sampled by uniformly sampling the angle subtended at the center, it 
is sufficient to generate ip uniformly from to 2ir. Thus, ip g [0, 2tt) is generated with 
a G-to-*b mapping given by ip = 2Gn, with a uniformly distributed random number 

Ge[0,l). 

However, the <f>\ value thus computed may violate the geometry constraint Equa- 
tion (8) on post-collision velocities. If such a violation occurs with the generated random 
offset, the <p± value can be "corrected" by adding tt to it; the addition of tt guarantees sat- 
isfaction of the geometry constraint on post-collision velocities because of the following 
observations: 



Since (a — b) = 2R cos cpi , and (c — d) — 2R sin (p% , 

if r 2 ix • (a - b) + r 2 i y ■ (c - d) <0 

then r2ix ■ (2Rcoscpi) +T2i y ■ (2i?sin</>i) < 

=^> r 21x ■ (2i?cos(0i + tt)) + r 21y ■ (2i?sin(0i + tt)) > 0. 

Thus, the post-collision angle, using the random angle offset ip, is computed as either 
F : <pi = <pi + ip or F w : <f>\ — <p l + ip + tt. In reverse execution, the pre-collision angle <p 1 
is recovered as R : <p^ — <p\ — ip or R v : <p 1 = <p\ — ip — tt. If the fact that tt was added 
by the forward execution is somehow remembered (i.e., whether F or F n was applied), 
then the correct pre-collision angle can be recovered during reversal (by applying R or 
Rk respectively). 
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Our key observation here is that it is possible to avoid having to "remember" whether 
the 7r offset was added or not. If it was added in forward execution, then, in reverse 
execution, the fact that ip + it must be used, as opposed to ip, can be detected by a 
violation of the geometry constraint on the pre-collision velocities if it is not added. The 
correct value of 4> 1 = (pi — ip — it is thus possible to be recovered correctly. 

To adopt this approach, it is necessary and sufficient to show that no ambiguity exists 
during the reverse execution regarding which of F or F v was executed in the forward 
execution, so that the correct operation, R or R v , is applied for correct reversal. In 
other words, it is necessary and sufficient to satisfy the following conditions: 

. R(F(<P)) = </> . R w (F n ((/>)) - 4> . F{<f>) ± F n (<fi)) 

. R 7I (F(4>)) ? 4> . R(F^(4>)) (P . R(4>) ? R^)). 

These conditions can be proven as follows. Let CS((p) = r 2 i x • (2Rcos<p) + r 2 i y ■ 
(2Rsm<p). We know that F and F n are exclusive with respect to satisfaction of Equa- 
tion (8) with <f>i, i.e., F satisfies CS((pi) > 0, if and only if F^ satisfies CS{4>\) < 0. 
Similarly, R and R^ are exclusive with respect to satisfaction of Equation (8) with (p 1 , 
i.e., R satisfies CS(4> 1 ) < 0, if and only if R^ satisfies CS{4> 1 ) > 0. 

The geometrical constraints require that the post-collision velocities satisfy CS((f>i ) > 
and the pre-collision velocities satisfy CS'(0 1 ) < 0. These requirements are satisfied if 
R is used to reverse F, or R v is used to reverse F v , respectively. In other words, from 
Equation (11), we know that: 

• if F satisfies CS(4>i) > in forward, then R satisfies CS^^ < in reverse, and 

• if F n satisfies CS(4>i) > in forward, then R^ satisfies CS(4> 1 ) < in reverse. 
Finally, the ambiguity is fully resolved when the following are also satisfied: 

• If F satisfies CS(4>i) > 0, then R„ violates CS^'i) < °- 

• If F^ satisfies CS((f>i) > 0, then R violates CS{4> 1 ) < 0. 

The former can be proved as follows, and the latter can be proved along similar 
lines: Since CS^) < (pre-collision), and F gives CS((f>i) > (post-collision), R^ 

gives CS(0i - tp - it) = CS{<h + ip - ip - ir) = CS^ - it) = -CS {</>[). This implies 
CS'(0 1 ) = — CS{4> 1 ), which is a contradiction. 

The forward and reverse algorithms are given in Procedure 1 and Procedure 2 respec- 
tively. This completes the generation of reversible random offsets for all configurations 
of 2-particle collisions in 2 dimensions, ensuring full phase space coverage with zero 
memory overhead. 

Procedure 1 (</> 1 — > <pi): Forward Function for 2- Particle in 2 Dimensions 
l: ip 4— 2Gn {generate a random offset from [0, 2ir)} 

2: (pi (cp l + ip) mod 2ir {post-collision is pre-collision offset by randomized ip} 
3: {Next, if post-collision is converging, correct it to be diverging} 
4: if r 2 \ x ■ cos (pi + r 2 i y ■ sin^i < then 
5: (pi 4- ((pi + it) mod 2tt 
6: end if 
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Procedure 2 (0i — 5- 0i): Reverse Function for 2-Particle in 2 Dimensions 
l: ip <— 2Gir {recover the random offset} 

2: </>j 4— (4>i — tp) mod 2tt {initial guess at pre-collision angle} 
3: {Next, if pre-collision is diverging, correct it to be converging} 
4: if r 2 \ x ■ cos + r 2 i y ■ sin 0i > then 
5: 4> 1 4- (cj) 1 — 7r) mod 2tt 
6: end if 



4.3 Reversal for 2-Particle Collisions in 3 Dimensions 

In a collision of 2 particles in 3-dimensional space, the equations of motion and post- 
collision geometry are: 



a + b = a, c + d = (3, e + / = 7 
a 2 + b 2 + c 2 + d 2 + e 2 + f 2 = 6, and 26 > a 2 + [3 2 + 7 2 



Dynamics, (12) 
\ Geometry. (13) 



r 2 \ x ■ (a - b) + r 2 i y ■ (c - d) + r 2iz • (e - /) > 

for some r 2 i x , r 21y ,r 21z > J 

The pre-collision geometrical constraints are obtained by replacing > by < in the 
post-collision constraints. The equations of motion imply that: 



(°-iY+{c-l)'+(<-d 2 = 25 - {a T + ^ =»>- <"> 

i.e., the point (a, c, e) lies on a sphere of radius R = — (a 2 + (3 2 + j 2 ) centered 

at (2, 2). whose parametric equations are given by: 

q. (3 y 

a = — + i? sin 0i sin 02 , c = — + i?sin0i cos0 2 , e = — + Rcos<p\, 

A Zi A ( _LO ) 

01 e [0,tt], and 2 € [0,2vr) . 

Equation (15) provides the V-to-<I> and 4>-to- V mapping functions for 2-particle colli- 
sions in 3 dimensions. 

In determining $ from V , if <j>\ = or 0i = 7r, then, the value of 02 is immaterial. 
In that case, we choose to set 2 = 0. Setting 2 thus, without regard to 2 , loses 
information about 2 when 0i = 0. To deal with this rare special case, the value of 
2 can be logged in the forward collision, and restored from the log in the backward 
collision. Note that this is logged in forward execution only if 0i = 0, and not for every 
collision. In the reverse execution, 2 is recovered from the log only if 0i =0. 

For the G-to-* mapping, the offsets ipi and ip 2 are obtained by uniformly sampling 
the surface of a unit sphere. Any algorithm for this purpose can be employed (e.g., 
[Mar72]), using two random numbers G\ € [0, 1] and G 2 € [0, 1). 

Similar to the 2-dimensional 2-particle case, the post-collision angles are computed as 
01 = (0!+?Ai) mod 27ror0i = (0 1 +'0i+7r) mod 27r, and 2 = ((f> 2 +ip 2 ) mod 2tt. The 
choice of whether tt is added to ip is determined by the one that satisfies the geometrical 
condition given by Equation (13). The proof of reversibility of the computation of 0i is 
analogous to the one in the preceding sub-section. 
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This completes the generation of reversible random offsets for all configurations of 
2-particle collisions in three dimensions, ensuring full phase space coverage with zero 
memory overhead. 



4.4 Reversal for 3-Particle Collisions in 1 Dimension 

In this section, we determine the phase space of the permissible reversible random offsets 
that needs to be sampled for 3-particle collisions in 1 dimension. 



4.4.1 Solving the Equations of Motion 

The equations of motion are: 

a + b + c 



a 



+ b 2 + c 2 = 6,36 >a 2 j 



Dynamics. 



(16) 



The post-collision geometrical constraints are given by: 

Only two of these three need be 
satisfied for any given geometric 
configuration r 2 i, r 32 , r± 3 > 





(a- 


b) >0, 


r 3 2 


(b- 


c) >0, 


t-13 


(c- 


a) > 



(17) 



In Equation (17), when any two inequalities are satisfied, the third one is resulting. This 
is because the two inequalities that are satisfied imply a specific ordering of the three 
particles along the single dimension, which in turn automatically satisfies the remaining 
third constraint. The pre-collision geometrical constraints are obtained by replacing 
> by < in the post-collision constraints. The solution to the equations of motion 
satisfies: 



a 2 + 



= 6 , where a 

3 



V2 



and b - 



a + b 
~7T 



(18) 



V V3 7 

which leads to the parametrization: 

a = — = cos <pi , b = a H _ _ 

V2 3 

Converting to a, b and c, we get: 



sin A = V2 



6 - — , and <j>x 



e[0,27r). (19) 



a A. , 1 . a A, 

- + -(cos 0i + -^=sm0i), b= - + -(-cos 0i 



-^=sin0i), 



A 



c= - + -(--psin^i), A = V2\ 5- — , and 0i G [0,2tt 



-v2 



(20) 



3 



Equation (20) provides the V-to-<& and &-to-V mapping functions for 1-dimensional 
3-particle collisions. 
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4.4.2 Sampling the \I> Phase Space 



Analogously to the 2-particle 2-dimcnsional case, the phase space of the post-collision 
angle is sampled by generating a random fa as a random offset fa <E [0, 2w) from the 
pre-collision angle fa. The mapping function from a uniform random number G € [0, 1) 
to ipi is more complex than that for the 2-particle case. In the 2-particle case, the phase 
space for velocities lies on the circumference of a circle, which can be sampled simply by 
uniformly sampling the angle subtended at the center. However, the phase space of the 
velocities in the 3-particle case lies on the circumference of an ellipse, which cannot be 
sampled simply as ipi — 2Gir. Instead, any correctly unbiased procedure for generating 
the angle by sampling the circumference of an ellipse can be employed for the purpose 
of defining the G-to-* mapping function. One such method is described in the next 
section. 

4.4.3 Sampling the Circumference of the Ellipse 

Here, we present a new procedure for uniformly sampling a point from the perimeter 
of an ellipse. The procedure is designed to be suitable for use in reversible execution, 
requiring exactly one random number per sample. While rejection-based procedures 
exist for this problem, they cannot be used here due to their irreversibility, as explained 
later in Section 4.6. 



Figure 3: Sampling scheme for a uniformly selected random point on an ellipse 



Consider the ellipse (^j~J + \d~J = ^ shown in Figure 3. Let L : ip — > I be 

a function that maps any angle 1 ip, < ip < 2ir, to the length I of the arc moving 
counter-clockwise along the circumference of the ellipse from (d x ,0) to (r cos^rsin^), 
where r 2 = d x dy/(dy cosi/> 2 + d x simp 2 ). Let L' 1 : £ — > ip be the inverse function that 
determines the angle corresponding to any given arc length I. Thus, L^ X {L{^))) = ip. A 
random point on the circumference of the ellipse can be obtained as the end of the arc 
whose length is G ■ L(2n), and the angle corresponding to that point can be obtained as 
fa = L~ 1 {G ■ L(2tt)), which serves as the G-to-Sl/ mapping for the 3-particle collisions 
in 1 dimension. 

The value of L(2ir) is computable by different methods. For example, L(2ir) = ir(d x + 
d y ) X)^o {°n) h n ', where h = ^j 2 + d ^2 , which can be computed to any desired precision. 
The value of L~ 1 {t g i ven ) can be obtained by a bisection method (binary search) that 
starts with the lowerbound tpi OW er = upperbound ipupper — 2tt, and an initial estimate 
value of ipguess = 7T, and repeatedly adjusts the lowerbound or upperbound to the guess 

1 We will use the terms angle and parameter interchangeably, with the understanding that the angle 
is in fact the parameter in the representation of the point on the ellipse, and not the geometrical angle. 
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value (depending on whether L(ip guess ) < £ given or £ given < L(tp guess ) respectively), 
until the correct angle corresponding to £ g i ven is determined. 

For the ellipse in Equation (18), d x = \J 5 — ^ and d y — -^=. Since d x and d y 
only depend on S and a, the value of ^1 can be generated and recovered (re-generated) 
independently of individual particle velocities. 

Note that the computational burden in sampling the ellipse occurs both in log- 
based approaches and in our approach. Thus, the forward portions incur the same 
computational cost. However, we use the same computational procedure on the reverse 
path, to re-generate the sample and use it in the reverse procedure. Since log-based 
approaches rely on memory, they do not incur computational cost on the reverse path, 
but incur extra memory copying cost in forward execution. In a normal, well-balanced 
parallel execution, since reversals of collisions are far fewer than forward collisions, the 
extra computational cost of our approach in the reverse procedure is much less than the 
savings gained in toward execution compared to log-based approaches, resulting in an 
overall reduction in run time and memory. 



4.4.4 Resolving the Geometrical Constraints 

Note that, with the sampled = 4>\ + it is possible to correctly recover the pre- 
collision angle <p l , from which the values of the pre-collision velocities can be recovered, 
but not necessarily their correct assignation to the identities of the particles. Thus, 
there still remains the problem of uniquely recovering the configuration of the particles, 
since the preceding solution provides for two different but equivalent configurations that 
only differ in that their left and right particle identities are swapped. Without modi- 
fying the procedure, one additional bit of memory would be needed for each collision 
to disambiguate between the two configurations. Since the aim is to completely elimi- 
nate memory accumulation, the model needs additional development for reversibility, as 
presented next. 

Equation (20) gives the key terms in the geometrical constraints as: 

2ir 2ir 
(a — b) = Acos^i, (b — c) = Acos(0i — ), and (c — a) = A cos (cf>i + — ). (21) 

Equation (21) indicates the permissible phase space of <f>i with the property that 
we exploit here for reversibility, namely, that the three terms cos</>, cos(</> — ^ L ), and 
cos(</> + ^p) never carry the same sign, i.e., if one is negative, the other two are non- 
negative, and if one is positive, the other two are non-positive (see Figure 4). 

From Equation (21), we deduce: 

cos (j>i + cos {<f>i — ) + cos (</>i + — ) = 0, since A > 0. (22) 



Without loss of generality, let a , b , and c represent the pre-collision velocities of the 
left, center and right particles, respectively, colliding along the one-dimensional path. 
This is illustrated in Figure 5. Let the corresponding post-collision velocities be a, b, 
and c, respectively. Define a canonical assignation of (j> 1 such that a — b = Rcos4> 1} 
b — c = i?cos ((f> 1 — ^p), and c — a = Rcos (<f> 1 + ^-). Note that this convention fully 
covers the phase space of all pre-collision velocities for the given total momentum and 
kinetic energy, and in that sense, is general and equivalent to any other valid convention. 
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Figure 4: Variation of the geometrical terms with <pi in 3-particle collisions in 1 dimen- 
sion 




Figure 5: Canonical configuration of 3 particles in 1 dimension 



Pre-collision: For the left and center particles to collide, a — b > 0. Similarly, for 
the center and right particles to collide, b — c > 0. These two conditions constrain the 
range of 4> 1 to [-|, -|), since it is only for that range of <j) 1 that cos^ and cos {<j) 1 — ^p) 
are both non-negative. The exact value of <pi m that range is determined from the 
following: 



V2 



+ b 
V2 



and tan i 



V2, 



(23) 



Thus, any pre-collision configuration of (a , b , c ) velocities of left, center, and right 
particles can be uniquely and completely represented by a, 5), where ^ < 4> 1 < f . 

Post-collision: For the left and center particles to diverge after collision (i.e., not to 
pass through each other), their post-collision velocities must satisfy a — b < 0. Similarly, 
for the center and right particles, b — c < 0. These two conditions constrain the range 
of 0i to [-5r-, ^t), which is essentially offset by +ir from the range of <j) 1 . 

Pictorially, the regions of interest in the [0, 2tt) range are given in Figure 6. The full 
[0, 27r) range is divided into six regions, each spanning =?-. The first region starts at 
2. In the figure, Rl is the range of (p 1 (pre-collision), and SI is the range of <j>\ (post- 
collision). Any given angle tp l in Rl corresponds to a set of three velocities {a ,b , c }, 
such that the pre-collision geometrical constraints of Equation (17) are satisfied, implying 
a specifically ordered sequence of the particles, say, a; b; c, along one dimension. The 
regions R2 and R3 correspond to left-rotation of the Rl sequence, namely, c; a; b and 
6; c; a, obtained by offsetting (f) 1 by and — respectively. Similarly, the regions S2 
and S3 correspond to the right-rotation of the 51 sequence. Note that the pre-collision 
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and post-collision angles only fall in the Rl and 51 regions, respectively, and the other 
regions are unreachable by the system. They are defined and used in intermediate 
calculations when computing post-collision angles in forward execution (and recovering 
pre-collision angles in reverse) while ensuring full phase space coverage. 



K 

2 




in 
2 



Figure 6: Division of the angular space for 0i and 4>\ m 3- particle collisions in 1 dimen- 
sion 

Reversible Sampling: The problem of making the collision reversible now becomes 
equivalent to defining a reversible (i.e., one-to-one and onto) mapping from any given 
01 €[§,§) to a random sample of 0i £ [^-, 4^), while accurately preserving the under- 
lying distribution of velocities along the circumference of the ellipse in Equation (18). 
In general, any such bijection could serve the purpose. Here, such a mapping function 
is provided in Procedure 3 (forward) and Procedure 4 (reverse). 

The forward function essentially rotates (f>i uniformly over the entire range [0, 2n) 
first, and then makes adjustments reversibly, as needed, to map back to the valid range 

Procedure 3 (cf) 1 — > (f>i): Forward Function for 3-Particle in 1 Dimension 

I <— G ■ L(2ir) {Pick a random arc length on ellipse} 
■01 <— L _1 (f) {Find angle corresponding to arc length} 
{Compute post-collision angle} 
01 <- (</>! + -0i) mod 2ir {Step 1} 
{Adjust post-collision angle if/as necessary} 

if more than one of cos 0i, cos (0i — ^L) and cos (0i + is positive then 

01 <- (<Pi + tt) mod 2ir {Step 2} 
end if 

if cos 4>\ is positive then 

^1 <- (0i - if) mod 2tt {Step 3a} 
else if cos (</>i — 4f ) is positive then 

0i <- (0i + ?f) mod 2tt {Step 3b} 
end if 

The operation of Procedure 3 is illustrated in Equation (24). Step 1 adds a random 
offset in [0, 27r) to (f> 1 to get a candidate sample of 0i. Clearly, the candidate may fall 
in any of the six regions shown in Figure 6. The first correction is to detect if the 
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candidate angle maps to Rl, R2, or R3, and if so, remap it to SI, S2, or S3 respectively 
(because post-collision velocities must diverge, which implies that the post-collision free 
angle cannot be in Rl, R2, or R3). This is accomplished in Step 2 by adding tt to the 
candidate. In Step 3, if the candidate happens to already be in SI, then no additional 
adjustments are needed. Otherwise, if it falls in S3, it is wrapped back to SI by rotating 
it counter-clockwise by % (Step 3a), and if it falls in S2, it is wrapped back to SI by 
rotating it clockwise by ^ (Step 3b). To summarize: 

fa G {Rl} Stcpl > 0i G {51, 52, 53, Rl, R2, i?3} Stcp 2 > • • • 

randomize maps to 

• • • Step 2 ) 0i G {51, 52, 53} Step 3 ) 0i G {51} . (24) 

maps to maps to 

The reverse function shown in Procedure 4 uncovers the random offset first, and 
reconstructs a candidate fa. It detects adjustments, if any, that were made during 
forward execution, and then performs the opposite of adjustments to recover the correct 
original value of <f> 1 . 

Procedure 4 (fa — ► fa): Reverse Function for 3-Particle in 1 Dimension 

I 4— G ■ L(2ir) {Recover the random arc length on ellipse} 

fa 4— L~ l (l) {Recover the angle corresponding to arc length} 

{Compute initial guess of pre-collision angle} 

fa 4- (0i — fa) mod 2tt 

{Correct the guess if/as necessary} {Step 1} 

if more than one of cos0i, cos (0 X — ^) and cos (fa + ^f) is positive then 

01 4- (0i - 7r) mod 2tt {Step 2} 
end if 

if cos 0i is negative then 

4>i <~ {4>i - if) mod 27r { Ste P 3a l 
else if cos (fa — ^f ) is negative then 

4>\ <~ [<t>i + if) mod 27r {Step 3b} 
end if 

The operation of Procedure 4 is illustrated in Equation (25). Step 1 removes the 
random offset in [0, 27r) from 0i to get a first guess of the original fa. Clearly, the 
guessed value may fall in any of the six regions shown in Figure 6. The first correction 
to the guess is to detect if the candidate angle maps to SI, S2, or S3, and if so, remap it 
to Rl, R2, or R3 respectively (because pre-collision velocities must be converging, which 
implies that the pre-collision free angle cannot be in SI, S2, or S3). This is accomplished 
in Step 2 by subtracting tt from the guessed value. In Step 3, if the guess happens to 
already be in Rl, then it already represent the correct original value of fa. Otherwise, 
if it falls in R2, it is wrapped back to Rl by rotating it clockwise by ^ (Step 3a), and if 
it falls in R3, it is wrapped forward to Rl by rotating it counter-clockwise by -jf (Step 
3b). To summarize: 

0'i G {.Rl} ( Stcp 3 fa G {Rl, R2, R3} ^ Stcp 2 



maps to maps to 

( Stcp 2 0i G {51, 52, 53, J21, R2, R3} < — ^Ei 0i G {51} . (25) 

maps to dc-randomizc 
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4.4.5 Combining Dynamics and Geometry 



The series of transformations performed in forward and reverse collision operations is 
summarized in Equation (26), which shows the input of the three pre-collision velocities 
(a', b', d) being transformed by the V-to-3? mapping into the triple formed by the angle 
momentum a, and energy 6. These are fed into the forward parts of the Function 
1 or Function 2 with an additional input, which is the uniformly distributed random 
number G. The resulting triple of the post-collision angle <f>\ together with momentum 
and energy are transformed by the <I?-to-V~ mapping into the post-collision velocities 
(a, b, c). For reversal, the forward process is inverted by first recovering the angle 4>i upon 
applying the V~-to-<fr mapping on the post-collision velocities. The random number G is 
recovered by retracing the random number sequence, and the reverse part of the function 
is applied to recover <p 1 , from which the pre-collision velocities are obtained by applying 
the <&-to-V mapping on the recovered angle. 

I ' l! '\ V-to-* , r\ Forward , , rN *-to-V / i. \ 

(a,b,c) > {<t>i,a,6) > (<pi, £*,<>) > \a,b,c) 

? 

G (26) 

I 

i i i 3>-to-V / ,' ex Reverse / , rN V-to-* / 7 \ 

(a,b,c)i {<p l ,a,6)< (^) U a,d) i {a,b,c) 

4.5 Reversal for 3-Particle Collisions in 2 Dimensions 

In a collision of three particles in 2-dimensional space, the equations of motion are: 

a + b + c = a 
d + e + f = f3 
a 2 + b 2 + c 2 + d 2 + e 2 + .f 2 = 5 

3S>a 2 + f3 2 

The geometrical constraints satisfied by the post-collision velocities are: 

^2ix ■ (a — b) + r^iy ■ (d — e) > 0, if PI and P2 are in contact "1 

rs2x ■ (b — c) + r 32y ■ (e — /) > 0, if P2 and P3 are in contact > Geometry. (28) 

ri3x • (c — a) + ri3 y ■ (/ — d) > if P3 and PI are in contact I 

The pre-collision geometrical constraints are obtained by replacing > by < in the 
post-collision constraints. In Equation (28), we will use Kl to denote the first inequality 
(PI and P2 in contact), K2 to denote the second inequality, and K3 to denote the third. 

The solution to the equations of motion satisfies the hyper-ellipsoid equation: 



> Dynamics. (27) 



6-0 
1 



b-ga 



d-0 



cl — b - a + b T d— e 
where a — — — , b = — — , a = — — , and e 



1 

V 7s J 

d + e 



a 2 + /3 2 



(29) 



V2 



V2 



V2 



V2 ' 



18 



The hyper-ellipsoid can be described via parametic equations with three independent 
parameters {0i, 02, <fe} as the degrees of freedom as follows: 



A 



COS 01 . 



d = —= sin 0i cos 02 , 

V ^ 

- V2 A 

b= - a+ vm 

—p + — sin 
3 V2V3 



e = 



sin 0i sin 2 cos 03, 

'i sin 02 sin 3 , where 



Pi € [0,7T], 
> <fe € [0,7T], 

fee [0,2tt). 



(30) 



Based on the preceding parametric equations, the terms in the geometrical con- 
straints can be expressed as: 



a — 


b = 


A cos 0i 




d- 


e = 


A sin 0i cos 2 




b- 


c = 


— (Vs sin 0i sin 02 cos 03 


— COS 01 ) 


e — 


f = 


^ (VS sin 0i sin 02 sin 03 


— sin 0i cos 02) 


c — 


a = 


( V3 sin 0i sin 2 cos q 


!> 3 + COS 01 ) 


f~ 


d = 


(V3 sin 0i sin 02 sin (j. 


3 + sin 0i cos 02 ) 



(31) 



Note that sin0i and sin0 2 are always non-negative. 

The V-to-«I> and <&-to-V mappings are obtained from Equation (i 

V-to-<I>: Given a — /, the angles 0i, 02, and 03 are computed as: 



_! fa- b 



pi =cos 



h = < 



cos 



-1 



A 

if 0i = or 0i = 7T 

d-e \ , 

otherwise 

Asm 0i / 

= or 0i = 7T or 02 



(32) 



tion (31). 



if 0i = or 0i = 7T or 2 = or 2 = n 

h = { . _i /2(e- /) + Asin0icos0 2 \ J _. 

sin -= I otherwise 

V V3A sin 0i sin 02 / 

-to-V: Given a, /3, S, 0i, 2 , 03, the values of a — / arc computed from Equa- 
(31). 
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Note that the cases of 0i = or 0i — it or 2 — or 2 = ir are solved as follows: 



a 


A 


3 + 


9 


a 


A 


3 ~ 


2 


a 




3 




1 + 


A 




2 


/? 


A 


3 

/3 


2 


3 




(30) 


in 



a = — + — cos 0i 
■ cos 01 

z 

(33) 

6= 3 
/ = ? 



sider the case of < 0i < ir and < 02 < ?!"• Also, whenever 0i = or 0i = n, we set 
02 = and 03 = 0. Similarly, whenever 2 = or 2 = n, we set 3 = 0. 



4.5.1 Possible Geometries 

To make analysis easier, a notion of a "canonical configuration" is introduced for the 
three colliding particles in the 2-dimensional space. The canonical view is to align the 
x-axis with the line joining the centers of two particles in contact. The choice of the 
pair chosen for this line is designed to be recoverable in reverse execution, essentially 
making the choice of the pair only dependent on the geometry of collision, independent 
of the velocities of the particles undergoing the 3-particle collision. 

When the x-axis is aligned along such a pair of particles in contact, they can appear 
in one of four configurations shown in Figure 7. In all configurations, the horizontal line 
is chosen to be the line joining the two particles whose identifiers are smaller than that 
of the third one. The particle with the smallest identifier is always chosen as the particle 
on the left of the horizontal axis line. 

In configurations CI and C2, all three particles are in contact with each other, giving 
three pairs of particles in contact. In each of the rest, C3 and C4, only two pairs of 
particles are in contact. In all configurations, PI and P2 arc the left and right particles 
forming the horizontal axis. In CI, the third particles P3 is above the two horizontally 
placed particles, while, in C2, the third particle is below them. The configurations C3 
and C4 cover the rest of the possibilities in which only two pairs of the particles are 
in contact with each other at the same time. In C3, the third particle P3 is only in 
contact with particle P2, while, in C4, P3 is only in contact with particle PI. 

Once the canonical configuration is chosen, all the velocities are rotated to reorient 
to the new axes. Total momenta and energy undergo a resultant change that can be 
reversed to recover original momenta and energy values simply by rotating the axes back 
to original axes. Since the original velocities are in a one-to-one relation with the trans- 
formed velocities, it is the transformed velocities that will be considered as velocities 
a..f defined earlier for the 3-particle, 2-dimension collision model. After computing in- 
dividual post-collision velocities, they are rotated back to the original frame of reference, 
thus restoring the original total momenta and energy. 
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Figure 7: The canonical forms that we dehne for the four possible configurations in 
which three particles may undergo collision in 2 dimensions 



4.5.2 Sampling the \& Phase Space 

To generate a random set of offsets, a random sample point on the hyper-ellipsoid of 
Equation (29) is generated by invoking the numerical method given in Procedure 5 with 
s = 4, and ^Aj | 1 < i < 4} obtained by converting Equation (29) into the canonical 
form expected by the algorithm. The sampling approach is a generalized version of the 
approach employed for 3-particle collisions in 1 dimension (Section 4.4) to sample the 
phase space of 1 4 r . As mentioned earlier, although rejection-based methods are available 
for generating the samples, rejection is incompatible with reversibility, making them 
inapplicable here. 

Note that, by using one extra random number (i.e., using d n + 1 random numbers 
instead of <i„), Procedure 5 can be elegantly generalized, avoiding reliance on the separate 
algorithm of Section 4.4.3 for the special case of a 2-dimensional ellipsoid (ellipse). This 
can be achieved by iterating down to s = 1, introducing an extra parameter ?/) s , and using 
the additional random number to set ip s to either ^ or ^ (i.e., ixi = ±iAi). Using such 
a generalized algorithm for sampling the surface of an s-dimensional ellipsoid, s > 1, the 
algorithm in Section 4.4.3 can be replaced by a call to the generalized Procedure 5 with 
s = 2. However, to restrict the number of random numbers to d n , we customize the last 
iteration to use the optimized version given in Section 4.4.3. 

In the next sections, we derive the permissible ranges of the parameters, restricted 
from their full, nominal ranges due to conservation laws as well as geometrical con- 
straints, and develop the forward and reverse collision procedures based on the derived 
parameter ranges. 



4.5.3 Configuration CI 

From the geometry of CI (ref. Figure 8), it can be seen that r 2 i x = D, r 2 i y = 0, 



r 3 2y 



D, 



ri3x 



-D 



, and rizy — —^-D. Using these in Equation (28) 
and Equation (31) to account for the geometrical constraints, we obtain the ranges of 
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Procedure 5 (G — > \&): Generate the Parameters ^ of a Random Point on 
the Surface of an s-Dimensional Hyper-Ellipsoid, % s , using Random Numbers G = 
{Gi, . .., G s -i} 

1: Input: s, { s \i | 1 < i < s}, where integer s > 1, and X^i=i (~T~) = 1 is the 
hyper-ellipsoid 

2: Output: {ipi 1 < i < s}, where tpi are the parameters of 

a random point { r x\, . . . , r x s ) on the hypcr-cllipsoid, such that r xi = 
s Xi costpi Yij=i s'mtpj for all 1 < i < s, and r x s = s \ s YVj=i sm V'j 

3: for k — s down-to 2 do 

4: Let k = s - k + 1 

5: if k = 2 then 

6: Invoke the algorithm in Section 4.4.3 using Gj: to generate the parameter 
< V* < 27r corresponding to a random point on the perimeter of the (2- 
dimensional) ellipse V.2 

7: ^k^^P* 

8: else 

9: Compute the surface area Ak of Hk 

10: Compute the random fraction r Ak of Ak as: r Ak G-^ • Ak 

11: Using the bisection method analogous to that in Section 4.4.3, determine a 

< ip* < 7r such that surface area of the calotte of Hk defined by the latitudinal 

angle tp* from the pole equals r Ak 

12: ll>k^lP* 

13: Determine the parameters { s -iAi | 1 < i < s — 1} of the (s — l)-dimensional 
ellipsoid % s -i formed by the opening of the calotte, obtained by substituting 

14: end if 
15: end for 



f>i and (j>2 that are more constrained than in Equation (30). 
The inequality Kl directly gives the following: 

£>■ (A cos <£i) +()■(■■■) >0 



< cos^i > 

(34) 



2 — ti — 2 

=> < 4>i < f from Equation (30) 
Inequality K2 gives: 

^ • ^|(\/3sm0i sin 02 cos0 3 — cos^i)^) + 
-^Y^ • ^(\/3sm0i sin (/) 2 sin0 3 — sin^i cos</> 2 )^ > 0. 

or, — -\/3 sin 0i sin (f> 2 cos 3 + cos 4>i + 3 sin 0i sin 2 sin 3 — V3 sin 0i cos 2 > 0. 

(35) 
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Figure 8: Measures of interest in configuration CI 



Inequality K3 gives: 

= £- ■ (=£(^/%smfa sin 02 cos0 3 + cos0i)^ + 

■ (^(V3 sin fa sin fa sin fa + sin fa cos fa)) > 0. (36) 

or, v3sin0i sin0 2 cos fa + cos fa + 3sin0! sin fa sin0 3 + \/3 sin X cos fa > 0. 

From K2, we get: 

LI: cos(fa + f ) < 27 3 tan ; isinfe ~ jdnn- (37) 

Similarly, from K3, we get: 

L2: - cos(0 3 ~ f ) < 27 3 ta 4 sinfe + (38) 

To ensure a valid range for the left hand side in LI, the right hand side (RHS) of the 
same must not be less than —1. Setting the RHS to —1 defines the boundary between 
the possible and impossible regions, in terms of the relation between fa and fa. Similar 
restrictions arise from L2. These considerations give the limits on fa and fa as follows: 

fa = cot _1 (\/3cos02 - 2\/3sin02) (from LI, RHS=-1) 
0i= cot _1 (-V3cos0 2 - 2V3siafa) (from L2, RHS=-1). 



When < fa < fa — -| , fa is unrestricted in its range of [0,n]. When fa > fa = ~, the 

~7T ' 



lower- and upper bounds of fa are restricted, as determined next. Let r = cot / j >1 (giving 



r < 1 when ^ < fa < ^). Then, the lower bound 02, _ 02 is obtained by solving 

r = cos 02 — 2 sin 02, giving 2i = 2 tan -1 ( ~ 2+ ^'~ r ' i -). Similarly, the upper bound 

02 _ 02„ is obtained by solving r — — cos fa — 2 sin 02 , giving 02 u = 2 tan - 1 ( +2+ j ^ 5 r ~ r - ) . 

Also, the values of 0i and 02 could restrict the range of fa, whose limits are obtained 
by setting the RHS to unity. The restrictions on fa are obtained from: 

0i = cot -1 (V3cos0 2 + 2-\/3sin02) (from LI, RHS=1) 
0i= cot- 1 (-%/3cos0 2 + 2V3sinfa) (from L2, RHS=1). 

All the limiting curves on the angles are illustrated in Figure 9, which shows the 
space spanned by the nominal ranges of 0i € [0, -|] and 02 G [0, 7r]. The space is divided 
into six different regions that impose different constraints on the ranges of 0i, 02 and 
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03- The first two regions, labeled Rq and R~ , demarcate the regions excluded to make 
RHS> —1 (Equation (39)), one on each side of 02 = Q and 2 = n. In the region marked 
Rq , the range of 03 is restricted from below to be greater than 0, and in the region 
marked R% , the range of 03 is restricted from the above to be less than 2ir. In the region 
marked RZ, 03 is restricted from both below and above. The appropriate lower bound 
03, , and upper bound 03 u on 03 can be computed accordingly. In the region marked 
R\, 03 's original range of [0, 2n) is unrestricted. 

Thus, for CI, the ranges for post-collision parameters are: 

01 € [0,f) 

r [o,o] if ^i-o 

h e < [0, tt] if o < 0! < f 

[ [02,, </>2 J otherwise (i.e., f < 0i < f ) (41) 

![0, 0] if 0i = or 02 = or 02 = tt 
[03,,03„] if (01,02) falls in R$, R+ or RZ 
[0, 2tt) otherwise (i.e., (0i,02) falls in R+). 
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Figure 9: Regions in the nominal phase spaces of 0i and 02 demarcating different bounds 
of 0i, 02, and 03 in Configuration 1 of 3-particle collisions in 2 dimensions 



24 



Using a similar analysis, the pre-collision ranges can be obtained, as illustrated in 
Figure 9, which shows the range of cf> 1 <G (§, 7r], and corresponding constraints of 2 and 
2 in terms of the forward regions F ~, F~ , and so on. 

The forward and reverse procedures for this configuration are given in Procedure 6 
and Procedure 7 respectively. 

Procedure 6 ((0i,02j03) — ^ (01:02,03)): Forward Procedure for Configuration 1 of 
a S-Particle Collision in 2 dimensions 

1: Invoke Procedure 5 to generate the parameters {ipi,tp2,^3) of a random point on 
surface of the hyper-ellipsoid represented by Equation (29) 

2: 0i (01 - | + Vi) mod f 

3: 02 <~ (02 + "02) mod TT 

4: Compute 02, and 02„ {based on 0i} 
5: if 02 < 02, or 2u < 02 then 

6: 02 <— (02 + (02 ; + TT - 02„)) mod 7T 

7: end if 

8: Compute 03, and 03 u {based on 0i and 02} 

9: 03 (03 + ip 3 ) mod 2tt 
10: if 03 < 3; then 
11: 03 <- (03 + 3( ) mod 2tt 
12: else if 03 > 3u then 
13: 03 <r- (03 + (2tt - 3 J) mod 2tt 
14: end if 

Procedure 7 ((0i,0 2 ,03) — > (0i> 02> 03)) : Reverse Procedure for Configuration 1 of 

a 3-Particle Collision in 2 dimensions 

1: Invoke Procedure 5 to re-generate the parameters (^1,^2,^3) of the random point 
on surface of the hyper-ellipsoid represented by Equation (29), previously generated 
by Procedure 6 
2: Recompute 2 , and 2u {based on 0i} 
3: Recompute 3l and 3u {based on X and 2 } 

4: 0i <- f + ((01 -Vl) mod f) 
5: 02 (02 — "02) mod TT 

6: if 0^ < 02, or 02^ < 2 then 

7: 02 <- (0 2 - (02, + TT - 02„)) mod TT 

8: end if 

9: 3 <— (03 — -03) mod 2tt 
10: if 3 < 03, then 
11: 03 (03 — 03, ) mod 2tt 
12: else if 3 > 03 u then 
13: 03 <- (03 - (2tt - 3 J) mod 2tt 
14: end if 



4.5.4 Configuration C2 

Using algebra similar to that for CI, the ranges for configuration C2 are obtained, with 
swapped signs for r 32y and ri 3y . 
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4.5.5 Configuration C3 



Configuration C3 can be parameterized by an angle 9 that P3 makes relative to P2, as 
shown in Figure 10. 




Figure 10: Parametrization of configuration C3 by angle 9, ^ < 9 < 4^ 

In this configuration, r2i x — D, r^iy — (as was the case for CI and C2), but 
f32x = —Dcos9, rs2y = —DsiaO, and we are not concerned about r±3 X and ri3 y . Using 
these in Equation (28), we get the geometrically constrained ranges of <j)i, fa, and fa. 
Since the inquality Kl for this configuration is the same as for configurations CI and 
C2, the range of fa remains (0, However, only the inequality K2 applies to this 
configuration, and K3 need not apply. From K2, we get: 

—D cos 6^{-\f?> sin fa smfa cos 03 — cos fa) 
— D sin 6^ (v3 sin 0i sin fa sin 03 — sin fa cosfa) >0 . 

Since sin^i ^ and sin fa ^ 0, 

cos(03 — 9) v3 sin fa sin fa < cos 9 cos fa + sin 9 sin fa cos fa 
=> cos(fa — 9) < 7, where 

cos 9 cos fa- + sin 9 sin fa cos fa 

sin fa sin </> 2 (42) 
=> 03 e [03, + 6, 03„ + where 

< (03, + 9) mod 2tt < (0 3u + 9) mod 2?r < 2?r, and 
03, and 03 u are solutions to cos^ 1 7. 

Also, the range of fa may be constrained due to the requirement that — 1 < 7 < 1. 
Let a = ™s9 and v = Then, 7 = ii±il£2^2.. If»-z/>0oru + z/<0, 

^ V3tan0i V3 ' ' sin 02 ^ — ^ — ' 

then the range of fa is not constrained, giving the lowerbound fa l < fa equal to and 
upperbound fa < fa u equal to n. Otherwise, the lower bound (greater than 0) and 
upper bound (less than tt) of fa must be determined as follows. 

If fi — v < 0, the lowerbound of fa is obtained by solving for fa in — sin fa = 
[i + v cos fa. Since sin</>2 and /i + v cos fa intersect in at most two points within the 
range (0,7r) (see Figure 11), a unique lowerbound fa, , < 02, < 02 < 7r is obtainable. 
Similarly, if /i + > 0, then a unique upperbound 02„ , < 02 < 02 u < 7r is obtained by 
solving for fa in sin fa = fi + v cos 02 . 
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Figure 11: Illustration of lower and upper bounds of <p2 in configuration 3 of 3-particle 
collisions in 2 dimensions 



Thus, for C3, the ranges are: 

0ie[o,f) 



[0, 0] if 0i = 
h e { [0, ?r] if /i - f > or /i + v < 
[02j , 02 u ] otherwise 



63 e 



[0, 0] if 0i = or 02 = or 02 = TT 
63, + 9, 3u + 9] otherwise. 



The ranges may be derived similarly for pre-collision. The range of 4> 1 remains to be 
the same as in configuration 1 at (f ,7r], but, Equation (42) changes to cos (0 3 — 9) > 7. 



4.5.6 Configuration C4 

The treatment of configuration C4 proceeds similar to that for configuration C3, using 
K3 instead of K2. 



4.6 Random Number Generation 

A subtle but important consideration in zero-memory reversal is the need to ensure that 
the exact number of random number invocations is also recovered during reverse execu- 
tion without explicitly storing that information. In other words, the number of random 
numbers thrown, Gc, for any collision C must be determinable by the collision operator 
such that the random number sequence is correctly reinstated to the proper position 
corresponding to the pre-collision state. All our algorithms possess this property, with 
Gc = d n . 

In the case of 2-particle collisions in 1 dimension, no random numbers are needed, 
trivially satisfying the property. In the case of 2-particle collisions in 2 dimensions, 
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exactly one random number is generated per collision (Procedure 1), and hence the 
random number stream is stepped back by exactly one number during reversal of that 
collision (Procedure 2). 

The case of 2-particle collisions in 3 dimensions is slightly more complex, since it 
requires more than one random number to be generated. For forward collision, it appears 
possible to sometimes generate one random number and some other times two. Only 
one random number (let us denote it by G\) appears sufficient to be generated if that 
number happens to result in <f>\ — or <pi — ir. Similarly, two random numbers (let us 
denote them by G\ and G2) appear needed only to generate </>2 if the randomly generated 
(j>i (from Gi) is such that 4>\ ^ and <fri ^ ir. However, this conditional generation of 
one or two random numbers per collision creates difficulties during reversal, because, 
when the random number stream is reversed and the previous random number G is 
recovered, we will remain unsure whether G corresponds to G\ or G2. It is impossible to 
disambiguate between the two possibilities because both <pi and (f>2 may assume the value 
of zero. Hence, we would remain unsure if, in the forward collision, G\ was zero and 
hence G2 was not used for that collision, or if Gi happened to be non-zero and G2 = G 
happened to be zero. Similar ambiguity can be argued for the case of G ^ 0. Due 
to these considerations, we fix the number of random numbers generated per forward 
collision to be exactly two, unconditionally, so that the random number stream can be 
reversed exactly by two, restoring it to the correct pre-collision state. If Gi results in 
4>i = 0, G2 is still generated from the stream, but simply discarded by the collision 
algorithm. Assuming that the stream is random, the discarding of G2 when </>i = docs 
not affect the uniformity of the random samples. Note that the discarding is performed 
unconditionally, without any dependence or usage of the actual value of the discarded 
G2 in the forward model. This aspect of unconditional discarding is crucial for reversal, 
because the reversal also can determinstically reverse the random number stream. 

For 3-particle collisions in 1 dimension, exactly one random angle is required for 
every forward collision, and hence one reversal is necessary and sufficient in each reverse 
collision, ensuring correct reversal of the random stream. For 3-particle collisions in 2 
dimensions (Appendix 4.5), one, two, or three random angles are needed (depending 
on the geometry and pre-collision velocities) per forward collision. However, due to 
considerations similar to those for the case of 2-particle collisions in 3 dimensions, we 
use exactly three random numbers for every forward collision (even if only one or two of 
them may be sufficient in special cases of dynamics and geometry) in order to reverse 
the random number stream correctly. 

In general, exactly d n random numbers must be generated for every collision involving 
n particles in d dimensions. 

With regard to the number of distinct streams to employ in the simulation, it is 
possible to use one of the following three approaches: (1) a single random number 
stream for the entire system, or (2) N independent streams, corresponding to each 
particle in the system, or (3) d n independent streams for use in each collision. The 
first approach clearly requires a very high quality random number generator with a 
very long period in order to support a large number of collisions when N is large. The 
second approach requires relatively smaller periods per stream but also requires minimal 
correlation between streams. In any given collision, the random stream of the particle 
with the smallest identifer among the colliding particles can be used for that collision. 
The third approach can be used to sample exactly one random number per stream per 
collision. All approaches seem appropriate for reversal, depending on the modeler's 
specific needs regarding computational cost and the stream period. 
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Another context in which reversibility considerations of random number streams 
plays an important role is in generating random samples of *S . The G-to-ty function for 
sampling & involves sampling the circumference of an ellipse (or, in general, sampling the 
points on the surface of higher dimensional ellipsoids) . While rejection-based sampling 
procedures [PTVF07] are available for such problems, they cannot be used in reversible 
execution. This is because the number of (uniformly distributed) random numbers 
used by such rejection-based procedures varies with each sampled point, which makes it 
impossible to unambiguously reverse the random number stream without keeping track 
of how many random numbers were generated for each sample. In fact, rejection-based 
sampling can only be employed for certain special classes of probability distributions, 
whose parametric input does not vary across samples [PD09], whereas, in sampling ty, 
the parametric input varies with each collision. 

5 Implementation Results 

In order to test the performance of the algorithms, we implemented the algorithms in 
software using the CH — h programming language, and executed simulation experiments. 
The experiments are intended to test (1) the ability to restore the initial state after 
a sequence of many forward collisions followed by their reversals, and (2) the quality 
of phase space coverage discerned from the uniformity of generated velocity samples. 
For random number generation, we used a reversible version of a high quality linear 
congruential generator [LA97] with a period of 2 121 . A single generator stream is used 
for all the particles. In each configuration, the experiments verify successful reversal 
across several thousands of collisions. 

5.1 Collision Sequence Reversal 

Procedure 8 shows the pseudocode of the experiment program for 2-particle colli- 
sions in 2 dimensions. Procedure 9 shows the pseudocode of the experiment program 
for 3-particle collisions in 1 dimension. In both, the state of the particles is initialized to 
any desired intial configuration, followed by a sequence of N c applications of the forward 
collision operator. RNG(S) represents the generation of the next sample in [0, 1) from 
the random number stream (which also updates S as side-effect), while RNG -1 (S) 
represents the reversal of the most previous invocation to RNG and the recovery of the 
most recently generated value from RNG. After each forward collision, post-collision 
velocities are multiplied by —1 to give the new pre-collision velocities for the next col- 
lision. Since the post-collision velocities are guaranteed to be divergent, their opposites 
are guaranteed to give converging velocities that will result in the next collision. After 
all N c forward collisions, the entire sequence is reversed by executing the reverse colli- 
sion operator N c times. Clearly, the reversal is successful if the final velocities after all 
N c reversals recovers the initial velocities. This condition is verified at the end and the 
corresponding status message is printed. 

All executions terminated successfully with a "passed" status, verifying the restora- 
tion of initial state after reversal of N c collisions. The unbiased and correct phase space 
coverage is tested with N c up to 10 6 by plotting the post-collision angles against random 
angle offsets and pre-collision velocities. 

In the case of 3-particle collisions in 1 dimension, numerical integration to compute 
the segment length of an ellipse was performed using the Simpson's rule. Also, care was 
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Procedure 8 Reversal Illustration for 2-Particle Collisions in 2 Dimensions 
1: (a, b, c, d) <— (a , bo, c U; do) {initial velocities} 

2: a a + b, (3 <— c + d, S <— a 2 + b 2 + c 2 + d 2 {momenta and energy} 
3: S random number seed 

4: J^iir *~ 1) r 2it, 1 {normalized collision geometry} 

5: for i = 1 to N c do { Forward Execution } 

6: 4>\ ±- V-to-*(a, 6, c, rf) of Section 4.2 
7: G <— RNG(S) {Generate next random number in [0, 1)} 
8: (pi <— Apply Procedure 1 on <p 1 using G, r2i x , and r2i y 
9: (a, 6, c, ef) «- *-to-V (a, /3, <5, </>i) of Section 4.2 

10: a < a, 6 < 6, c < c, d i d {Reverse velocities to create next collision} 

11: end for 

12: for i — N c to 1 do { Reverse Execution } 

13: a i a, b i 6, c i c, d i d 

14: <j)i y-to-*(a, b, c, d) of Section 4.2 

15: G -s— RNG~ 1 (S) {Recover previous random number} 

16: <p 1 Apply Procedure 2 on 0i using G, r 2 ia;, and r 2 i y 

17: (a, 6, c, ef) <fr-to-V (a, /3, <5, ^i) of Section 4.2 

18: end for 

19: if a = a and b = b and c = c and d = c? then { Verification } 

20: print Passed' 
21: end if 



needed to account for numerical precision issues when using the numerically computed 
cosine function, which sometimes produces numerical noise for angles within very small 
neighborhoods of multiples of f and §. A tolerance of ±10 -8 around zero was employed 
when determining whether any given cosine value can be considered zero or positive. 

Figure 12 plots the post-collision angles against their corresponding pre-collision 
angles generated as part of an execution of 2-particle collisions in 2 dimensions, and 
plots the post-collision angles against their corresponding random angles in the same 
execution. Both plots show excellent uniformity and correctness of the sampled phase 
space. Similarly, Figure 13 shows the corresponding data for 3-particle collisions in 1 
dimension; again, they display excellent coverage. Since the covered areas in the plots 
become too dense for visual clarity when N c becomes large, only data for iV c = 10 4 are 
shown in the figures 2 . 

5.2 Randomness Tests 

The correctness of phase space coverage is experimentally verified by testing the ran- 
domness of the generated angles using statistical tests. The Diehard Battery of Tests 
[Mar95] for statistical verification of randomness is used for this purpose. 

The post-collision angles are mapped to double-precision numbers, each in [0,1), 
and the uniformity of their distribution is verified. This method is applied to 2-particle 
collisions in 2 dimensions, and to 3-particle collisions in 1 dimension. 

2 Figures for much larger number of collisions (N c = 10 s ) were also generated confirming similar 
uniformity of coverage and correctness of sampled phase space, but they result in much larger file sizes 
and hence not included here. 
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Procedure 9 Reversal Illustration for 3-Particle Collisions in 1 Dimension 

1: (a, b 1 c) <— (ao, bo, Co) {initial velocities} 

2: a <— a + b + c, S <— a 2 + 6 2 + c 2 {momenta and energy} 

3: 5 <s— random number seed 

4: for i = 1 to N c do { Forward Execution } 

5: 4\ <~ V-to-*(a, 6, c) of Section 4.4 

6: G RNG(S) {Generate next random number in [0, 1)} 
7: 0i Apply Procedure 3 on (p 1 using G 
8: (a, &, c) «- <&-to-V(a, <5, </>i) of Section 4.4 

9: a i a, b i b, a c {Reverse velocities to create next collision} 

10: end for 

11: for i — N c to 1 do { Reverse Execution } 

12: a < a, b i b, a c 

13: (pi <- V-to-*(a, 6, c) of Section 4.4 

14: G 4- RNG~ 1 (S) {Recover previous random number} 

15: 4> 1 <— Apply Procedure 4 on <\>\ using G 

16: (a, &, c) 4- &-to-V(a, S, </>[) of Section 4.4 

17: end for 

18: if a — a and b = b and c = c then { Verification } 

19: print Passed' 
20: end if 



For 2-particle collisions in 2 dimensions, each post-collision angle <j)\ is mapped to 
a double-precision number 77 € [0, 1) as 77 = (^i — k)/tt, where k — if < <f>i < ^, 
and k = 7r otherwise < </>i < 27r). Each 77 is converted into an integer equal to 
[i] x 4, 294, 967, 296J , and the resulting series of numbers (in hexadecimal format) is 
converted via the asc2bin program of Diehard to a binary-formatted hie given as input 
to the diehard program. 

For 3-particle collisions in 1 dimension, each post-collision angle <f>\ is mapped to a 
double-precision number r\ € [0, 1) as r\ = (<fii — — Each 77 is converted 

into an integer equal to \jj x 4, 294, 967, 296J , and the resulting series of numbers (in 
hexadecimal format) is converted via the asc2bin program of Diehard to a binary- 
formatted file given as input to the diehard program. 

Both Procedure 8 and Procedure 9 were successfully executed with N c — 3, 000, 000 
collisions, such that they terminate with a "Passed" result. The angles are logged to a 
file during their execution, and then used as input to the randomness tests. According 
to the Diehard tests, if the "p- values" computed and printed by diehard are observed to 
be strictly greater than and less than unity, randomness is understood to be satisfied 
[Mar95]. The p- values observed from the randomness tests on the generated angles are 
given in Table 1. Good phase space coverage via randomization is indicated by the fact 
that the p-values from several tests are significantly away from zero and unity. 

For 2-particle collisions in 2 dimensions, all tests in the Diehard repository were used 
and verified to generate very good randomness (positive p- values less than unity) without 
exception. This is due to the fact that no numerical precision effects are present in the 
collision algorithm for 2-particle collisions in 2 dimensions. For 3-particle collisions in 
1 dimension, all tests were used except those that operate on bit-level representations 
(such as the Birthday, Bitstream and Count-the-ls tests). This is because round off and 
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trunction effects in numerical integration, even at a relatively high precision of 10~ 9 in 
the computation of angles, introduces non-random patterns in the last few bits of the 
mantissa, appearing as non-randomness when selectively viewed in isolation or across 
multiple floating point numbers. However, when the angles are viewed as numbers 
themselves, uniform randomness is indeed observed, as expected. 

Table 1: Randomness indicator p- values from Diehard battery of tests 



Test 


Category 


p- value 






2-Dimension 


1-Dimension 






2-Particle 


3-Particle 


CRAPS 


Overall 


0.979800 


0.570270 




Wins 


0.790715 


0.862293 




Throws/game 


0.979797 


0.570273 


RUNS 


Set 1 - Up 


0.847070 


0.298998 




Set 1 - Down 


0.128221 


0.632054 




Set 2 - Up 


0.183069 


0.871807 




Set 2 - Down 


0.244909 


0.111657 


SUMS 


10 x^-tests on 100 x^-tests 


0.370145 


0.257959 


SQUEEZE 


42 degrees of freedom 


0.890107 


0.435410 


3DSPHERES 


X^-test on 20 p- values 


0.813199 


0.897844 


MINDIST 


X^-test on 100 min-distances 


0.348209 


0.751603 


PARKLOT 


X^-test on 10 p-values 


0.947200 


0.720233 



6 Performance Estimation 

To estimate the performance gain that can be expected by resorting to reversible col- 
lisions instead of state saving, we implemented a simulation of a sequence of 2-particle 
collisions in a closed system of N particles in d = 2 dimensions. Experiments were run 
with N = 1,000, N = 10,000, and N = 100,000 particles. Starting with random ini- 
tial configuration of positions and velocities, and randomly selected inter-collision times, 
particle motion is simulated between collisions, and, at every collision point, the collision 
operator is applied on a random pair of particles. With state saving, the system state 
is saved to memory before every collision, to be able to roll back to that state. With 
reverse computation, no state is saved, as the system can be rolled back perfectly to any 
point in the past by reverse computation alone, without reliance on memory. 

Two platforms with different computational and memory characteristics are tested: 
one with traditional central processing unit (CPU), and the other with newer graphical 
processing unit (GPU). Modern CPUs now have much higher computational speeds than 
memory speeds; the differential between computational and memory speeds is even more 
pronounced in modern GPU platforms [PFS05] . Implementation on the CPU is realized 
in the C++ programming language, and that on the GPU is in the CUDA programming 
language [SK10]. The CPU is an AMD Opteron 6174 processor with 64 GB of memory. 
The GPU is a high-end nVidia Geforce GTX 580 (Fermi) accelerator with 512 CUDA 
cores and 3 GB device memory. Compilation systems used were gec 4.4.5 and CUDA 
4.1. 

In each simulation run, N c = 1000 collisions were simulated, and, after iV c collisions, 
the system was rolled back to the beginning. With reverse computation, the positions 
and velocities are verified to match the initial conditions exactly (to within at least 
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Figure 12: Sampling for iV c = 10 4 collisions in 2-particle collisions in 2 dimensions 
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Figure 13: Sampling for N c = 10 4 collisions in 3-particle collisions in 1 dimension 
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e = ±10 9 ), while, with state saving, the results are trivially exactly matched. 

The results are drawn as stacked histograms in Figure 14, with the total height of each 
bar representing the total time for forward execution of N c collisions and their reversal, 
which is split into Forward Time and Rollback Time in milliseconds. Three variants 
are benchmarked: SSI represents the state saving mechanism in which all state is saved 
(positions and velocities); SS2 represents an optimized state saving mechanism in which 
the positions are saved and only the four components of the pre-collision velocities of the 
colliding pair are saved; RC represents the reverse computation with no memory. For 
each variant, the suffix -CPU represents execution on the CPU, and -GPU represents 
execution on the GPU. 



CPU-based implementation 



CPU-based implementation 



(a) 



Rollback Time (ms) 
Forward Time (ms) 



Number of particles — 1,000 

80 



(b) 
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Figure 14: Performance of state saving and reverse computation with N c = 1000 colli- 
sions 
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Total run time with reverse computation is lower across the board. As expected, the 
greatest differential is observed in Figure 14(f) for the GPU runs with the largest number 
of particles (N = 100,000). Since the ratio of memory transfer cost to computational 
cost is much higher with the GPU, reverse computation runs much faster, while state 
saving incurs the high memory transfer cost for every collision. Also, the GPU platform 
is known to be extremely efficient with large vectorized codes such as this simulation 
(i.e., more particles can be simulated with little increase in total time, if memory bottle 
neck is relieved). Hence, the reverse computation runs are extremely fast on the GPU, 
compared to all CPU runs and also compared to state saving with GPU. On smaller 
number of particles (N — 1000 and TV = 10,000), CPU runs in (a) and (c) are faster 
than GPU runs in (b) and (d) because of larger CPU caches, yet, even in this case of 
relatively lower memory cost, reverse computation is observed to run faster. Even more 
importantly, when the number of particles is further increased (e.g., N > 1 million), 
state saving becomes infeasible due to memory limitations, but reverse computation 
runs well even at such large scale. Similarly, the benefit of using reversible simulation 
only increases with increase in the number of collisions, due to corresponding increase in 
the memory needs of state saving. A more detailed analysis of the memory subsystem 
behavior (e.g., data cache misses at levels 1 and 2, and translation lookaside buffer 
metrics) for each of these runs is part of our planned future work. 

7 Summary 

The classical problem of simulating elastic collisions of hard spheres has been revisited, 
with the important additional requirement of reversibility. Although classical simula- 
tion of elastic collisions has been well studied in the literature, little has been known on 
how to simulate them reversibly with minimal memory overhead. Here, we formalized 
the problem in terms of accurate phase space coverage specification and geometrical 
constraints. We solved the problem by developing a general framework that combines 
reversible pseudo random number generation with new mapping functions, geometri- 
cal constraints, and reversal semantics. While previous log-based approaches require 
memory proportional to the number of collisions, our algorithms incur essentially zero 
memory overheads and also ensure correct phase space coverage. We developed the 
detailed steps for 2-particle collisions (up to 3 dimensions) and 3-particle collisions (up 
to 2 dimensions). In these configurations, memory overhead is exactly zero for colli- 
sions in which d n — 1, and essentially zero for collisions with d n > 1. In the latter 
configurations, {</> i+1 , • • • )<^d n } are logged if and only if fa — 0, for any 1 < i < d n . 
Generalizations to collisions among larger number of particles and at higher dimensions 
are tedious, but can be carried out if needed. At higher dimensions and with larger 
number of particles, computationally expensive numerical integration becomes neces- 
sary in both classical (forward-only) approaches as well as in the forward procedures 
of our reversible method. To meet the goal of minimal memory overhead, our reverse 
procedures rely on numerical integration as well, whereas log-based reversal approaches 
would use memory to save information and avoid recomputation in the reverse path. 
In a normal, well-balanced parallel execution, reversals of collisions are far fewer than 
forward collisions (i.e., dependency violations are incurred infrequently). In such cases, 
our reversible models are more efficient, since forward cost for reversibility is eliminated, 
whereas log-based approaches would incur log-generation costs in forward execution. 
Finally, although reversibility of elastic collisions is a seemingly simple problem to 
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formulate, it includes sufficient complexity to make it challenging. The new approach 
and results presented here offer insights in revisiting additional classical physical system 
models with the added requirement of reversibility and exploring their fundamental 
memory characteristics and limits. 
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